Where we see the pipeline processes
Date: October 28, 2024
library(knitr)
library(targets)
library(MiscMetabar)
here::i_am("analysis/01_bioinformatics.qmd")
d_pq <- tar_read("d_vs", store=here::here("_targets/"))
tar_glimpse(script=here::here("_targets.R"), targets_only = TRUE, callr_arguments = list(show = FALSE))
The {targets} package is at the core of this project. Please read the introduction of the user manual if you are not familiar with {targets}.
The {targets} package stores … targets in a folder and can load (tar_load()) and read (tar_read()) objects from this folder.
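The difference between reading and loading from a store can be illustrated with a minimal sketch. The toy pipeline, the target names (`raw_counts`, `total_count`) and the temporary paths below are hypothetical and are not part of this project; the sketch only assumes that {targets} is installed.

```r
library(targets)

# Hypothetical throwaway locations, so the real project store is left untouched.
demo_dir <- file.path(tempdir(), "targets_demo")
dir.create(demo_dir, showWarnings = FALSE)
script_path <- file.path(demo_dir, "_targets.R")
store_path <- file.path(demo_dir, "_targets")

# Write a toy two-target pipeline to the script file.
tar_script(
  {
    library(targets)
    list(
      tar_target(raw_counts, c(10, 20, 30)),
      tar_target(total_count, sum(raw_counts))
    )
  },
  script = script_path,
  ask = FALSE
)

# Run the pipeline; results are written to the store folder.
tar_make(script = script_path, store = store_path)

# tar_read() returns the stored value, which can be assigned to any name.
total <- tar_read(total_count, store = store_path)

# tar_load() creates an object named after the target in the calling environment.
tar_load(raw_counts, store = store_path)
```

This report uses the tar_read() form with an explicit `store` argument, which keeps the name of the object in the report (`d_pq`) independent of the name of the target (`d_vs`).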
DT::datatable(d_pq@sam_data)
formattable_pq(
  d_pq,
  "Type",
  min_nb_seq_taxa = 1000,
  taxonomic_levels = c("Order", "Family", "Genus"),
  log10trans = TRUE
)
Cleaning suppress 0 taxa ( ) and 0 sample(s) ( ).
Number of non-matching ASV 0
Number of matching ASV 1147
Number of filtered-out ASV 1135
Number of kept ASV 12
Number of kept samples 2
Cleaning suppress 0 taxa and 0 samples.
Joining with `by = join_by(OTU)`
| OTU | Order | Family | Genus | Mono | proportion_samp | nb_seq |
|---|---|---|---|---|---|---|
| Taxa_2 | Glomerales | Claroideoglomeraceae | Claroideoglomus | 4.36 | 1 | 4.36 |
| Taxa_12 | Glomerales | NA | NA | 3.85 | 1 | 3.85 |
| Taxa_21 | NA | NA | NA | 3.64 | 1 | 3.64 |
| Taxa_3 | Glomerales | Glomeraceae | Glomus | 3.61 | 1 | 3.61 |
| Taxa_5 | Glomerales | Glomeraceae | Glomus | 3.48 | 1 | 3.48 |
| Taxa_17 | Glomerales | Glomeraceae | Glomus | 3.42 | 1 | 3.42 |
| Taxa_15 | NA | NA | NA | 3.39 | 1 | 3.39 |
| Taxa_13 | Glomerales | Claroideoglomeraceae | Claroideoglomus | 3.38 | 1 | 3.38 |
| Taxa_23 | Glomerales | Claroideoglomeraceae | Claroideoglomus | 3.32 | 1 | 3.32 |
| Taxa_19 | NA | NA | NA | 3.23 | 1 | 3.23 |
| Taxa_41 | Glomerales | Glomeraceae | Glomus | 3.14 | 1 | 3.14 |
| Taxa_42 | Glomerales | NA | NA | 3.05 | 1 | 3.05 |
kable(tar_read(track_sequences_samples_clusters, store=here::here("_targets/")))
| | nb_sequences | nb_clusters | nb_samples |
|---|---|---|---|
| Paired sequences | 20790310 | 5610 | 452 |
| Paired sequences without chimera | 20503830 | 3746 | 452 |
| Paired sequences without chimera and longer than 200bp | 20495636 | 3708 | 452 |
| ASV denoising | 67803 | 3708 | 2 |
| OTU after vsearch reclustering at 97% | 67803 | 1147 | 2 |
| OTU vs after mumu cleaning algorithm | 67803 | 31 | 2 |
| OTU vs + mumu + rarefaction by sequencing depth | 4000 | 28 | 2 |
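To make the tracking table easier to read, the loss at each step can be expressed as a number of sequences and as a percentage of the starting count. The sketch below is only an illustration in base R: the counts are copied from the first three rows of the table (the rows that all cover 452 samples), and the short step labels are hypothetical abbreviations introduced here.

```r
# Sequence counts copied from the first three rows of the tracking table.
# The step labels are shortened, hypothetical names for those rows.
track <- data.frame(
  step = c("paired", "no_chimera", "no_chimera_min_200bp"),
  nb_sequences = c(20790310, 20503830, 20495636)
)

# Sequences lost relative to the previous step (NA for the first step).
track$lost <- c(NA, -diff(track$nb_sequences))

# Share of the initial paired sequences still present after each step.
track$pct_of_start <- round(100 * track$nb_sequences / track$nb_sequences[1], 2)

print(track)
```

With these counts, chimera removal accounts for 286480 lost sequences and the 200 bp length filter for a further 8194.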
tab_samp <- tar_read(track_by_samples, store=here::here("_targets/"))
for (li in names(tab_samp)) {
  print(knitr::kable(tab_samp[[li]], caption = li, format="html"))
  cat('\n<!-- -->\n\n')
}
| | nb_sequences | nb_clusters | nb_samples |
|---|---|---|---|
| ASV denoising | 33764 | 96 | 1 |
| OTU after vsearch reclustering at 97% | 33764 | 55 | 1 |
| OTU vs after mumu cleaning algorithm | 33764 | 19 | 1 |
| OTU vs + mumu + rarefaction by sequencing depth | 2000 | 14 | 1 |
| | nb_sequences | nb_clusters | nb_samples |
|---|---|---|---|
| ASV denoising | 34039 | 113 | 1 |
| OTU after vsearch reclustering at 97% | 34039 | 76 | 1 |
| OTU vs after mumu cleaning algorithm | 34039 | 28 | 1 |
| OTU vs + mumu + rarefaction by sequencing depth | 2000 | 24 | 1 |
Session information is detailed below. More information about the machine, the system, and the Python and R packages is available in the file data_final/information_run.txt.
sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Debian GNU/Linux 12 (bookworm)
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.11.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0
locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Paris
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] MiscMetabar_0.10.1 purrr_1.0.2 dplyr_1.1.4 dada2_1.32.0
[5] Rcpp_1.0.13 ggplot2_3.5.1 phyloseq_1.48.0 targets_1.8.0
[9] knitr_1.48
loaded via a namespace (and not attached):
[1] bitops_1.0-9 deldir_2.0-4
[3] gridExtra_2.3 permute_0.9-7
[5] rlang_1.1.4 magrittr_2.0.3
[7] ade4_1.7-22 matrixStats_1.4.1
[9] compiler_4.4.1 mgcv_1.9-1
[11] png_0.1-8 callr_3.7.6
[13] vctrs_0.6.5 reshape2_1.4.4
[15] stringr_1.5.1 pwalign_1.0.0
[17] pkgconfig_2.0.3 crayon_1.5.3
[19] fastmap_1.2.0 backports_1.5.0
[21] XVector_0.44.0 utf8_1.2.4
[23] Rsamtools_2.20.0 rmarkdown_2.28
[25] UCSC.utils_1.0.0 ps_1.8.0
[27] xfun_0.48 cachem_1.1.0
[29] zlibbioc_1.50.0 GenomeInfoDb_1.40.1
[31] jsonlite_1.8.9 biomformat_1.32.0
[33] highr_0.11 rhdf5filters_1.16.0
[35] DelayedArray_0.30.1 Rhdf5lib_1.26.0
[37] BiocParallel_1.38.0 jpeg_0.1-10
[39] parallel_4.4.1 cluster_2.1.6
[41] R6_2.5.1 bslib_0.8.0
[43] RColorBrewer_1.1-3 stringi_1.8.4
[45] jquerylib_0.1.4 GenomicRanges_1.56.2
[47] SummarizedExperiment_1.34.0 iterators_1.0.14
[49] IRanges_2.38.1 Matrix_1.7-0
[51] splines_4.4.1 igraph_2.1.1
[53] tidyselect_1.2.1 viridis_0.6.5
[55] rstudioapi_0.17.1 abind_1.4-8
[57] yaml_2.3.10 vegan_2.6-8
[59] codetools_0.2-20 hwriter_1.3.2.1
[61] processx_3.8.4 lattice_0.22-6
[63] tibble_3.2.1 plyr_1.8.9
[65] Biobase_2.64.0 withr_3.0.1
[67] ShortRead_1.62.0 evaluate_1.0.1
[69] survival_3.7-0 RcppParallel_5.1.9
[71] formattable_0.2.1 Biostrings_2.72.1
[73] pillar_1.9.0 BiocManager_1.30.25
[75] MatrixGenerics_1.16.0 DT_0.33
[77] renv_1.0.11 foreach_1.5.2
[79] stats4_4.4.1 generics_0.1.3
[81] rprojroot_2.0.4 S4Vectors_0.42.1
[83] munsell_0.5.1 scales_1.3.0
[85] base64url_1.4 glue_1.8.0
[87] tools_4.4.1 interp_1.1-6
[89] data.table_1.16.2 GenomicAlignments_1.40.0
[91] visNetwork_2.1.2 rhdf5_2.48.0
[93] grid_4.4.1 tidyr_1.3.1
[95] ape_5.8 crosstalk_1.2.1
[97] latticeExtra_0.6-30 colorspace_2.1-1
[99] nlme_3.1-165 GenomeInfoDbData_1.2.12
[101] cli_3.6.3 fansi_1.0.6
[103] viridisLite_0.4.2 S4Arrays_1.4.1
[105] gtable_0.3.5 sass_0.4.9
[107] digest_0.6.37 BiocGenerics_0.50.0
[109] SparseArray_1.4.8 htmlwidgets_1.6.4
[111] htmltools_0.5.8.1 multtest_2.60.0
[113] lifecycle_1.0.4 here_1.0.1
[115] httr_1.4.7 secretbase_1.0.3
[117] MASS_7.3-61
@online{taudière2024,
author = {Taudière, Adrien},
title = {Bioinformatics Pipeline Summary},
date = {2024-10-28},
langid = {en}
}